@loewenheim
Contributor

@loewenheim loewenheim commented May 16, 2025

This enables filtering for SessionUpdate and SessionAggregate items. In the course of this, it also

  • adds Filterable and Getter implementations to SessionUpdate and SessionAggregates;
  • introduces a SessionProcessingConfig struct to bundle all the auxiliary data needed in process_session and process_session_aggregates (modeled on ReplayProcessingConfig);
  • slightly refactors how session processing is called.
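To illustrate the bundling idea in the second bullet, here is a minimal sketch of a config struct that gathers the auxiliary data a session processing function needs. Everything in it is a hypothetical stand-in: `GlobalConfig`, `ProjectConfig`, their fields, and the signature of `process_session` are assumptions for illustration and are not taken from the PR.

```rust
/// Hypothetical stand-in for globally shared settings.
pub struct GlobalConfig {
    pub filters_enabled: bool,
}

/// Hypothetical stand-in for per-project settings.
pub struct ProjectConfig {
    pub ignored_releases: Vec<String>,
}

/// Bundles the auxiliary data needed to process a session item,
/// so callers pass one argument instead of several.
/// The fields are assumed for illustration only.
pub struct SessionProcessingConfig<'a> {
    pub global_config: &'a GlobalConfig,
    pub project_config: &'a ProjectConfig,
}

/// Returns `true` if a session with the given release should be kept.
/// This simplified signature is an assumption, not the real one.
pub fn process_session(release: &str, config: &SessionProcessingConfig<'_>) -> bool {
    // With filtering switched off globally, every session is kept.
    if !config.global_config.filters_enabled {
        return true;
    }
    // Otherwise drop sessions whose release is on the project's ignore list.
    !config
        .project_config
        .ignored_releases
        .iter()
        .any(|ignored| ignored == release)
}
```

The benefit of this shape is that adding another piece of auxiliary data later only changes the struct and its construction site, not every function signature along the way.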

Aside from this not having tests, my main open questions are around the Getter impls. To wit:

  • Is there some sort of schema for what the path segments should be called? Is it just the names of the fields they return? Does anything break if they aren't chosen correctly?
  • Are the names for the "roots" of the two types appropriate? Should they be closer to the actual type names ("session_update"/"session_aggregates")?
  • Are there any fields I made addressable that shouldn't be?
  • In practical terms, do the Getter impls change anything about the way filtering works? I believe Filterable already returns the fields we actually care about.

Closes RELAY-41.

Comment on lines 2026 to 2027
&self.inner.global_config.current(),
&config,
Contributor Author

Here, we are getting passed the config from outside, but reading the global_config from self.inner. I did it this way because that's how it is in other processors (e.g. replays), but I don't know whether there's a good reason for it or it's a historical accident.

@loewenheim loewenheim linked an issue May 16, 2025 that may be closed by this pull request
@loewenheim loewenheim changed the title from "Sebastian/filter sessions" to "feat(inbound filters): Enable filtering sessions" May 16, 2025
@loewenheim loewenheim force-pushed the sebastian/filter-sessions branch from b97cba9 to 07b028b Compare May 28, 2025 08:56
@loewenheim
Contributor Author

I've now reduced the Getter impl for SessionUpdate to the bare minimum (release, environment, ip_addr, user_agent).
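For readers unfamiliar with the idea, a path-based getter restricted to those four fields could look roughly like the sketch below. This is a simplified stand-in, not the actual code from the PR: the `PathGetter` trait, the `session` root name, and the struct layout are all assumptions, and the real `Getter` trait may have a different signature and return type.

```rust
/// Simplified stand-in for a session update, holding only the
/// four fields named above. The layout is assumed for illustration.
pub struct SessionUpdate {
    pub release: String,
    pub environment: Option<String>,
    pub ip_address: Option<String>,
    pub user_agent: Option<String>,
}

/// Hypothetical, simplified path getter that resolves a dotted
/// path to a string value.
pub trait PathGetter {
    fn get_value(&self, path: &str) -> Option<&str>;
}

impl PathGetter for SessionUpdate {
    fn get_value(&self, path: &str) -> Option<&str> {
        // The "session" root is an assumed name; paths without it resolve to `None`.
        match path.strip_prefix("session.")? {
            "release" => Some(self.release.as_str()),
            "environment" => self.environment.as_deref(),
            "ip_addr" => self.ip_address.as_deref(),
            "user_agent" => self.user_agent.as_deref(),
            // Any other field is deliberately not addressable.
            _ => None,
        }
    }
}
```

Keeping the match arms to the bare minimum means an unknown or misspelled path simply yields `None` instead of exposing fields that were never meant to be addressable.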

@Dav1dde
Member

Dav1dde commented May 30, 2025

In practical terms, do the Getter impls change anything about the way filtering works? I believe Filterable already returns the fields we actually care about.

That's the question: do you even need any fields to be filterable (i.e. a Getter impl that can actually return anything)?

Is there some sort of schema for what the path segments should be called? Is it just the names of the fields they return? Does anything break if they aren't chosen correctly?

Usually it's the JSON path of the fields. Since sessions are more like metrics, this may not make sense here.

@loewenheim loewenheim marked this pull request as ready for review May 30, 2025 11:57
@loewenheim loewenheim requested a review from a team as a code owner May 30, 2025 11:57
@loewenheim loewenheim force-pushed the sebastian/filter-sessions branch from c1201d1 to 3ca5562 Compare May 30, 2025 11:58
"sid": "8333339f-5675-4f89-a9a0-1c935255ab59",
"timestamp": timestamp.isoformat(),
"started": timestamp.isoformat(),
"attrs": {"release": "[email protected]", "ip_address": "1.2.3.0/24"},
Member

Does this IP filtering match the one we have for errors? I think that for errors we do not filter on user IPs with that denylist, but on the actual ingestion IPs. We should follow the same approach here. If this filters on the user IP, it would be a different behaviour.

Contributor Author

The documentation of Filterable::ip_addr describes it as returning the client IP:

    /// The IP address of the client that sent the data.
    fn ip_addr(&self) -> Option<&str>;

And the impl for Event also seems to use the client IP:

    fn ip_addr(&self) -> Option<&str> {
        let user = self.user.value()?;
        Some(user.ip_address.value()?.as_ref())
    }
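For context on what checking such an address against a network range involves, here is a minimal sketch of IPv4 CIDR matching using only the standard library, with a range written like the `1.2.3.0/24` value in the test payload above. The function names `matches_cidr` and `should_filter_ip` are hypothetical, and whether Relay's denylist works this way is an assumption, not something this thread confirms.

```rust
use std::net::Ipv4Addr;

/// Returns `true` if `addr` lies inside the IPv4 network written as
/// `a.b.c.d/len`. Malformed input yields `false`.
pub fn matches_cidr(addr: &str, network: &str) -> bool {
    let Ok(addr) = addr.parse::<Ipv4Addr>() else {
        return false;
    };
    let Some((base, len)) = network.split_once('/') else {
        return false;
    };
    let (Ok(base), Ok(len)) = (base.parse::<Ipv4Addr>(), len.parse::<u32>()) else {
        return false;
    };
    if len > 32 {
        return false;
    }
    // A prefix length of 0 matches every address; it is handled
    // separately because shifting a u32 by 32 bits would overflow.
    let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
    (u32::from(addr) & mask) == (u32::from(base) & mask)
}

/// Hypothetical helper: filter an item if its IP, when present,
/// matches any network on the denylist.
pub fn should_filter_ip(ip: Option<&str>, denylist: &[&str]) -> bool {
    match ip {
        Some(ip) => denylist.iter().any(|network| matches_cidr(ip, network)),
        // Without an IP there is nothing to match against.
        None => false,
    }
}
```

Note that the helper is agnostic about where the address comes from. Which IP gets passed in (the user's or the ingesting client's) is exactly the behavioural question raised above.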

@loewenheim
Contributor Author

@Dav1dde I added a paragraph about client and user IPs to the should_filter docstring, lmk if it sounds ok.

@loewenheim loewenheim merged commit 04f531e into master Jun 3, 2025
28 checks passed
@loewenheim loewenheim deleted the sebastian/filter-sessions branch June 3, 2025 08:53
loewenheim added a commit to getsentry/sentry-docs that referenced this pull request Sep 15, 2025
Filtering of sessions was implemented in
getsentry/relay#4745. I'm leaving the comment
about minidumps in place because I'm honestly not sure what it's
supposed to mean.

Development

Successfully merging this pull request may close these issues.

Have option to exclude web crawlers from session tracking

4 participants